Characteristics which should be considered in reviewing any
standardized reading test include validity, reliability,
standardization sample, areas assessed by the test, type of response
required by the child, individual versus group administration, time
needed for administration, availability of equivalent forms, scoring
options available, reviewers' comments, information given concerning
interpretation of results and/or instructional suggestions, and groups
for whom the test is or is not appropriate. These characteristics are
presented in table form as they apply to [...] reading tests, including
the Gates-MacGinitie Reading Tests (Readiness), the Metropolitan
Readiness Tests, the [...] Reading Mastery Tests, and the Stanford
Achievement Tests. Evidence suggests that there is a need to develop
either dialect forms of reading tests or alternative scoring procedures
for dialect speakers. (KS)

Reading tests have been used for many years to assess pupil
achievement, determine pupil readiness, and identify specific strengths
and weaknesses. Teachers and other school personnel often select one or
more published tests from those available to them and administer these
tests to children in an attempt to assess the children's achievement,
readiness, or specific strengths and weaknesses. A test should be
carefully reviewed before a decision is made to use or reject it for a
given purpose. The criteria which should be considered in reviewing any
reading test include: validity, reliability, standardization sample,
areas assessed by the test, type of response required of the child,
individual versus group administration, time needed to administer the
test, availability of equivalent forms, scoring options available,
reviewers' comments, information given concerning interpretation of
results and/or instructional suggestions, and groups for whom the test
is or is not appropriate. Each is discussed briefly below:

Validity answers the question, "To what extent does the test measure
what it purports to measure?" Validity can be measured in several ways.
Content validity refers to the extent to which the test taps knowledge
of the curricular content and cognitive processes. Content validation
studies are commonly carried out when achievement tests are constructed.
Criterion-related (or predictive) validity tells how well a test
measures future performance on some criterion. It is particularly
important in readiness tests. Construct validity tells the degree to
which certain psychological traits or constructs are represented in
test performance. Face validity refers to whether or not, on first
impression, the test appears to measure the intended content.

Reliability refers to the accuracy of a measuring instrument. It
answers the question, "If we measure the same set of objects again
and again with the same instrument, will we get the same or similar
results?" Reliability, too, can be measured in several ways.
Determining test-retest reliability involves administering the same test
twice to see if individual scores change from one time to the next.
Determining parallel-form reliability involves administering two forms
of a test which are considered to be equivalent to determine whether
scores change from one test to the other. Split-half reliability is
determined through one test administration. The test is divided into
equal parts (e.g., odd-even) before it is administered and the two parts
are treated as if they were separate tests. In each case, the higher the
resulting reliability coefficient, the more reliable the instrument. The
standard error of measurement of a test indicates how chance errors may
cause variations in the scores which might be obtained by an individual
if the same test were administered numerous times. It is desirable to
use a test with a relatively small standard error.
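
The split-half procedure and the standard error of measurement can be
illustrated with a short sketch. It is not taken from any test manual:
the pupil scores are hypothetical, and the Spearman-Brown step-up and
the formula SEM = SD x sqrt(1 - reliability) are the conventional
textbook formulas, assumed here because the text does not state which
computations a given publisher uses.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability from one test administration.

    item_scores holds one list of item scores (1 = right, 0 = wrong)
    per pupil. The half-test correlation is stepped up to full length
    with the Spearman-Brown formula (an assumption, see lead-in).
    """
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)


def standard_error_of_measurement(total_scores, reliability):
    """Conventional SEM: score standard deviation times sqrt(1 - reliability)."""
    return pstdev(total_scores) * sqrt(1 - reliability)


# Hypothetical results: six pupils, eight items each.
scores = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 1, 0],
]
totals = [sum(row) for row in scores]
reliability = split_half_reliability(scores)
sem = standard_error_of_measurement(totals, reliability)
print(f"split-half reliability: {reliability:.2f}")
print(f"standard error of measurement: {sem:.2f} raw-score points")
```

For a fixed spread of scores, a higher coefficient yields a smaller
standard error, which is why the two are usually examined together
when a test is reviewed.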

To standardize a test, publishers administer it to a large group
of students selected to be representative of the population at the
grade level(s) for which the test is intended. This group is typically
called the standardization sample. In some cases a major effort is made
to see that the sample is representative of all the students in the
country. In other cases a much less representative sample is taken.
It is not the size of the sample that is of primary importance, but
rather the sample's representativeness of the group(s) the test is
intended for.

One needs to be aware of the areas assessed by a given test in order
to be able to match a test to the needs of the prospective examiner.
Clearly, if a teacher wanted to measure silent reading comprehension,
he should not use an oral reading test. One must also consider the
type of response required of the student. Does the child have
difficulty filling in small circles or responding verbally, which will
negatively influence the score he obtains? The setting in which the
test is administered (individually or in a group) and the time it takes
to administer the test should also be considered. If the teacher wants
to give the test as a pretest and later as a posttest, a test which has
two or more equivalent forms would be advantageous. The teacher also
needs to consider the kinds of scores he wants to obtain and whether
a given test yields the results in the specified format(s). When
available, reviewers' comments should be considered as well.

Test manuals vary in the amount of information given concerning the 
interpretation of results and/or instructional suggestions. Clearly, 
it is helpful to be provided with this information. Finally, one must
be concerned with whether a test is appropriate for the various
dialectal groups in our schools today.

Numerous reading tests were reviewed according to these criteria 
and are summarized in Table I.

INSERT TABLE I ABOUT HERE

As shown in Table I, none of the tests reviewed takes into account
dialect differences. Research has shown that certain tests are
linguistically and culturally biased (Hutchinson, 1972; Roberts, 1970).
There is some evidence in the literature (Hunt, 1975) which indicates
that Black English-speaking subjects scored significantly higher on
an oral reading test presented in standard English when "errors"
attributable to dialect were not counted as errors than when tests
were scored according to the directions given in the test manual.
Harber (1975) found that Black English-speaking subjects scored
significantly higher on oral reading passages presented in Black English
standard orthography than on equivalent oral reading passages presented
in standard English. Thus, there is empirical evidence which suggests
that there is a need for developing either dialect forms of reading
tests or alternative scoring procedures for dialect speakers. Not to
provide such tests or scoring procedures could lead to inaccurate and
misleading reading evaluations and inappropriate classification and
placement of dialect speakers.
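
The idea of an alternative scoring procedure can be made concrete with
a minimal sketch. It is purely illustrative and does not reproduce the
procedure of any study cited above: the record layout, the
`dialect_attributable` flag (assumed to be set by the examiner), and
the data are all hypothetical.

```python
def count_errors(miscues, ignore_dialect=False):
    """Count oral reading errors recorded for one pupil.

    With ignore_dialect=True, miscues the examiner has flagged as
    attributable to dialect are not counted as errors.
    """
    return sum(
        1
        for miscue in miscues
        if not (ignore_dialect and miscue["dialect_attributable"])
    )


# Hypothetical record of miscues for one pupil on one passage.
miscues = [
    {"word_position": 4, "dialect_attributable": True},
    {"word_position": 11, "dialect_attributable": False},
    {"word_position": 17, "dialect_attributable": True},
    {"word_position": 23, "dialect_attributable": False},
]

manual_errors = count_errors(miscues)                         # scored as the manual directs
adjusted_errors = count_errors(miscues, ignore_dialect=True)  # alternative scoring
print(f"errors by manual scoring: {manual_errors}")
print(f"errors by alternative scoring: {adjusted_errors}")
```

Reporting both counts side by side lets the examiner see how much of a
pupil's error score depends on the scoring rule rather than on reading
performance.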